
BSTJ 42: 6. November 1963: Spectral Characteristics of Digit-Simulating Speech Sounds. (Borenstein, D.P.)



Spectral Characteristics of Digit-Simulating Speech Sounds

By D. P. BORENSTEIN 

(Manuscript received July 11, 1963) 

A spectral analysis has been performed on a number of spoken vowel 
sounds, in particular those sounds causing digit registration in a TOUCH- 
TONE receiver. The analysis, implemented by computer methods, provides 
a definitive picture of the nature of digit simulation in TOUCH-TONE 
calling. 

I. INTRODUCTION 

A digit simulation in TOUCH-TONE calling (Ref. 1, pp. 9-12, 15-16) 
is, by practical definition, a speech segment capable of causing digit 
registration in a TOUCH-TONE signaling system. Spectral analyses 
have been performed on a number of speech segments, each of which 
was selected solely on the basis of having the above property. Briefly, 
a valid TOUCH-TONE signal requires the simultaneous presence of 
two code frequencies for a certain minimum length of time, and with 
some minimum signal-to-noise ratio. It was therefore theoretically an- 
ticipated (Ref. 1, pp. 10-12) that each of these speech segments would 
be linked by two other common characteristics: (1) a frequency spectrum 
having two sharply dominant peaks, and (2) a high degree of periodicity 
for some minimal length of time. 

There is good reason to believe that speech segments of this general 
nature are likely to be troublesome in any signaling system based on the 
transmission of voiceband tones over speech channels. 

Due to the inherent rarity and relatively brief duration of the voice- 
produced digit simulation, some special procedures were required both 
in obtaining and analyzing these speech segments. The remainder of this 
article comprises a description of these procedures, followed by a 
presentation and discussion of the resulting spectral analyses. 

THE BELL SYSTEM TECHNICAL JOURNAL, NOVEMBER 1963




Fig. 1. Apparatus for recording digit-simulating speech segments.
(Block diagram not reproduced; it includes a 10-kc frequency standard.)



II. COLLECTING THE SPEECH SAMPLES 

The digit-simulating speech segments were obtained by recording 
raw speech onto magnetic tape loops with the two-track recording 
arrangement shown in Fig. 1. 

Using a 60-inch loop of tape at a speed of 15 in/sec, speech is continuously
recorded at point A, played into a standard TOUCH-TONE receiver
at point B, and, if there is no receiver output, erased at point E
after traversal of the loop. Simultaneously, on a second track, a 10-kc 
pilot frequency is continuously recorded and erased at points B and C, 
respectively. If at any time there is a TOUCH-TONE receiver output, 
indicating the presence of a digit-simulating speech segment just past 
point B, the timing network is triggered. The timing network then 
performs two operations: (1) it disables the 10-kc record and erase after 
a delay of 35 ms, and (2) it stops the tape transport after a delay of 2 
seconds (half the loop traversal time).

This process yields a 60-inch length of tape consisting of about 29 
inches each of pre- and post-simulation speech plus a 1.5-inch (110 ms at 
15 in/sec) segment which contains both the actual simulating speech 
sample and the 10-kc pilot frequency. In this manner, fourteen such 
samples were obtained, at the average rate of about one per ten hours of 
raw speech — an indication of the extreme rarity of simulation with the 
present TOUCH-TONE receiver. 
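
The timing figures quoted above follow from simple arithmetic on the loop
length and tape speed. The short Python sketch below works them out. The
variable and function names are illustrative and are not part of the
original apparatus description; the total of raw speech is only as exact
as the "about one per ten hours" figure.

```python
# Tape-loop timing, using only figures quoted in the text.
LOOP_LENGTH_IN = 60.0      # loop length, inches
TAPE_SPEED_IPS = 15.0      # tape speed, inches per second
CONTEXT_LENGTH_IN = 29.0   # pre- (or post-) simulation speech, inches
SAMPLES_OBTAINED = 14      # digit-simulating segments collected
HOURS_PER_SAMPLE = 10.0    # "about one per ten hours" (approximate)


def seconds_for(length_in, speed_ips=TAPE_SPEED_IPS):
    """Time for a given length of tape to pass a fixed head."""
    return length_in / speed_ips


loop_time_s = seconds_for(LOOP_LENGTH_IN)     # one full loop traversal
stop_delay_s = loop_time_s / 2                # transport stop delay
context_s = seconds_for(CONTEXT_LENGTH_IN)    # speech kept on each side
raw_speech_hours = SAMPLES_OBTAINED * HOURS_PER_SAMPLE

print(f"loop traversal time : {loop_time_s:.2f} s")
print(f"stop delay          : {stop_delay_s:.2f} s")
print(f"context on each side: {context_s:.2f} s")
print(f"raw speech monitored: about {raw_speech_hours:.0f} hours")
```

Stopping the transport half a traversal after the trigger leaves the
simulating segment near the middle of the loop, which is why roughly
equal lengths of earlier and later speech are retained.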

III. ANALOG-TO-DIGITAL CONVERSION AND PRINTOUT 

By means of encoding equipment developed by the Acoustics Research 
Department, the fourteen digit-simulating speech segments were converted
from analog form to an eleven-bit digital signal. The sampling
rate of 10 kc was gated directly from the pilot track of the original analog 
tape, thus eliminating sources of error due to tape flutter during the 
original recording process. Once the digital tape was obtained, the 
conversion process was reversed to obtain an accurate X-Y recording of 
each of the fourteen speech waveforms. Visual inspection of these 
waveforms, two of which are shown in Fig. 2, confirms their periodic 
nature (the periodicity of the samples shown in Fig. 2 would be still 
more evident were it not for the fact that most speech fundamentals are 
considerably attenuated by telephone apparatus). 

IV. SPECTRAL ANALYSIS 

The fourteen speech samples, in eleven-bit digital format, were then 
subjected to a "pitch synchronous" Fourier analysis (Ref. 2) on the IBM-7090
computer. The pitch synchronous analysis consisted essentially of a 
conventional Fourier analysis performed on each successive funda- 
mental pitch period in the speech sample. These pitch periods, in turn, 
were determined on the computer by counting the number of sampling 
intervals (each being 100 μsec) between successive maxima in the waveform
and then interpolating between samples for greater accuracy. This
method of Fourier analysis is ideally suited to waveforms that maintain 
an almost-periodic structure over an appreciable length of time. 
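
The procedure just described can be sketched in Python as follows. This
is a minimal illustration, not the IBM-7090 program: the parabolic
refinement of each maximum, the linear resampling of each period to 64
points, and all function names are assumptions introduced here, since the
text does not specify the interpolation used. Only the 10-kc sampling
rate is taken from the text.

```python
import cmath
import math

FS = 10_000.0  # sampling rate in cps (10 kc, from the text)


def peak_positions(x, min_gap):
    """Fractional sample indices of the successive waveform maxima.

    A sample counts as a pitch marker when it is the largest value within
    +/- min_gap samples; its position is refined with a parabola through
    the sample and its two neighbours (an assumed interpolation).
    """
    peaks = []
    for i in range(1, len(x) - 1):
        lo, hi = max(0, i - min_gap), min(len(x), i + min_gap + 1)
        if x[i] > 0 and x[i] == max(x[lo:hi]) and x[i] > x[i - 1]:
            a, b, c = x[i - 1], x[i], x[i + 1]
            denom = a - 2 * b + c
            offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
            peaks.append(i + offset)
    return peaks


def sample_at(x, t):
    """Linearly interpolate the sampled waveform at fractional index t."""
    i = int(math.floor(t))
    frac = t - i
    return x[i] * (1.0 - frac) + x[i + 1] * frac


def period_spectrum(x, t0, t1, n_harmonics, n_points=64):
    """Harmonic amplitudes (linear units) over one pitch period [t0, t1)."""
    period = t1 - t0
    pts = [sample_at(x, t0 + m * period / n_points) for m in range(n_points)]
    amps = []
    for k in range(1, n_harmonics + 1):
        c = sum(p * cmath.exp(-2j * math.pi * k * m / n_points)
                for m, p in enumerate(pts)) / n_points
        amps.append(2.0 * abs(c))
    return amps


# Synthetic test wave: a 170-cps fundamental plus a half-amplitude 3rd harmonic.
F0 = 170.0
wave = [math.cos(2 * math.pi * F0 * n / FS)
        + 0.5 * math.cos(2 * math.pi * 3 * F0 * n / FS) for n in range(400)]

peaks = peak_positions(wave, min_gap=40)
pitch = FS / (peaks[1] - peaks[0])          # "instantaneous pitch" in cps
amps = period_spectrum(wave, peaks[0], peaks[1], n_harmonics=4)
print(round(pitch, 1), [round(a, 2) for a in amps])
```

Because each analysis window spans exactly one measured period, the
harmonics of the fundamental fall on the analysis frequencies, which is
the property that makes the method suited to almost-periodic waveforms.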

For each speech segment analyzed, the computer output consisted of 
a sequential set of bar graphs, one for each fundamental pitch period of 
the speech waveform. Each graph, in turn, is a plot of harmonic ampli- 
tude (the Euler coefficient) in db versus harmonic number. In addition, 
each graph gives the "instantaneous pitch" (i.e. the reciprocal of the 
period) of each fundamental period analyzed. Figs. 3 and 4 show the
two sets of spectra corresponding to the two speech segments whose
time domain waveforms appear in Fig. 2.

Fig. 2. Analog waveforms of two digit-simulating speech segments
(horizontal axis: time; scale mark: 1 ms). Also shown are the sex of
each speaker and the particular phoneme causing the simulation.
Fourier spectra of digit simulations 1 and 2 are shown in Figs. 3 and 4,
respectively. Arrows indicate periodicity, with the large arrow showing
the approximate start of the digit simulation.

Fig. 3. Set of Fourier spectra for digit simulation No. 1 (as shown in
Fig. 2). Each X represents a 1-db relative amplitude increment. Spectra
are in alphabetical order with respect to time. (Bar graphs not
reproduced: panels (a) through (j), amplitude versus harmonic number,
1 to 25. Instantaneous pitch printed on each panel, in cps: (a) 170.3,
(b) 172.1, (c) 171.9, (d) 170.2, (e) 172.2, (f) 171.6, (g) 170.5,
(h) 172.3, (i) 171.5, (j) 170.1.)
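
The bar-graph form of the computer output described in Section IV can be
imitated with a few lines of Python. This is only a sketch: referring
each level to the strongest harmonic and plotting a 30-db span are
assumptions made here, and the function names are illustrative. The one
feature taken from the paper is that each X stands for 1 db.

```python
import math


def to_db(amplitudes, floor_db=-40.0):
    """Level of each harmonic in db relative to the strongest harmonic."""
    ref = max(amplitudes)
    if ref <= 0:
        return [floor_db for _ in amplitudes]
    levels = []
    for a in amplitudes:
        if a > 0:
            levels.append(max(floor_db, 20.0 * math.log10(a / ref)))
        else:
            levels.append(floor_db)
    return levels


def bar_graph(amplitudes, span_db=30):
    """One column per harmonic, one X per db above the bottom of the plot."""
    levels = to_db(amplitudes, floor_db=-float(span_db))
    heights = [int(round(span_db + lv)) for lv in levels]
    rows = []
    for row in range(span_db, 0, -1):          # top row first
        line = "".join("X" if h >= row else " " for h in heights)
        rows.append(line.rstrip())
    return "\n".join(rows)


# Three harmonics at 0 db, about -6 db and -20 db relative level.
graph = bar_graph([1.0, 0.5, 0.1])
print(graph)
```

Read this way, two columns that stand well above their neighbours
correspond to the "two sharply dominant peaks" anticipated in the
introduction.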



V. DISCUSSION OF RESULTS 

Several aspects of the spectra shown in Figs. 3 and 4 are worthy of 
note. 

First of all, it is seen that these two speech segments (as well as the 
twelve others not shown here) do indeed satisfy the two properties 
anticipated in the introduction. The high degree of periodicity of these 
speech waveforms is spectrally confirmed by noting that in both se- 
quences of spectra the harmonic structure remains extraordinarily 
uniform. (This result also confirms, by hindsight, the original validity 
of a period-by-period Fourier analysis.) By noting the fundamental 
pitch (thus the period) of each segment, it is seen that this highly stable 
harmonic structure is maintained for at least the 23 milliseconds which 
coincides with the duration requirements of the TOUCH-TONE re- 
ceiver. 
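
The duration argument can be made concrete by converting pitch to period
and counting periods. In the sketch below, the 23-ms figure is taken
from the text; the pitches of 170 and 236 cps are illustrative round
values near those printed on the spectra, and the function names are
assumptions introduced here.

```python
import math

REQUIRED_MS = 23.0  # duration figure quoted in the text


def periods_needed(pitch_cps, required_ms=REQUIRED_MS):
    """Smallest number of whole pitch periods spanning required_ms."""
    period_ms = 1000.0 / pitch_cps
    return math.ceil(required_ms / period_ms)


def run_duration_ms(pitches_cps):
    """Total duration of a run of consecutive pitch periods."""
    return sum(1000.0 / p for p in pitches_cps)


for pitch in (170.0, 236.0):
    n = periods_needed(pitch)
    print(f"{pitch:.0f} cps: {n} periods, "
          f"{run_duration_ms([pitch] * n):.1f} ms")
```

A run of spectra at least this long with an essentially unchanged
harmonic structure is what the text means by a stable structure
maintained for the required duration.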



Fig. 4. Set of Fourier spectra for digit simulation No. 2 (as shown in
Fig. 2). (Bar graphs not reproduced: panels (a) through (o), amplitude
versus harmonic number, 1 to 25. Instantaneous pitch printed on each
panel, in cps: (a) 232.6, (b) 234.7, (c) 235.3, (d) 236.9, (e) 235.2,
(f) 237.4, (g) 236.5, (h) 236.6, (i) 236.8, (j) 239.3, (k) 239.1,
(l) 239.1, (m) 241.2, (n) 241.4, (o) 240.8.)

Secondly, one finds immediate justification for the fact that these 
speech segments caused digit simulation. By multiplying the funda- 
mental pitch of any segment by the orders of its two dominant har- 
monics, a valid TOUCH-TONE calling signal* is derived. Thus a voice- 
produced digit simulation is spectrally analogous to a valid TOUCH- 
TONE signal accompanied by noise, with the sole exception that in the 
former case both the "noise" and signal components are integral mul- 
tiples of a discrete fundamental frequency. Indeed, this sole distinction 
between a digit simulation and a valid signal might possibly be used to 
provide further simulation protection in future voice-frequency signal- 
ing applications. Specifically, a receiver might be designed to be sensitive 
to the presence of selected harmonics and/or sub-harmonics of valid 
signal frequencies, and thereby to reject many speech phonemes which 
would ordinarily cause simulation. 
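
The multiplication described above is easy to check numerically. The
Python sketch below tests whether two harmonics of a given pitch fall
within ±2.5 per cent of a low-group and a high-group frequency, using the
frequencies listed in the footnote. The function names are illustrative,
and the pitch of 171 cps with harmonic orders 5 and 7 is an example
chosen here; the text names the 7th harmonic for Fig. 3 but does not
state the low-group order.

```python
LOW_GROUP = (697.0, 770.0, 852.0, 941.0)      # cps
HIGH_GROUP = (1209.0, 1336.0, 1477.0, 1633.0)  # cps
TOLERANCE = 0.025                              # +/- 2.5 per cent


def match_tone(freq_cps, group, tol=TOLERANCE):
    """Return the group frequency that freq_cps falls within, or None."""
    for tone in group:
        if abs(freq_cps - tone) <= tol * tone:
            return tone
    return None


def simulated_pair(pitch_cps, low_order, high_order):
    """Tone pair matched by two harmonics of the pitch, or None."""
    low = match_tone(pitch_cps * low_order, LOW_GROUP)
    high = match_tone(pitch_cps * high_order, HIGH_GROUP)
    if low is None or high is None:
        return None
    return (low, high)


print(simulated_pair(171.0, 5, 7))    # 855 and 1197 cps
print(simulated_pair(171.0, 5, 10))   # 1710 cps lies outside every band
```

The same test shows why harmonic relationships could serve as a guard:
for a voiced sound, the matched pair always shares a common fundamental,
whereas the two tones of a genuine signal need not.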

In the portions of Figs. 3 and 4 where the harmonic structure is notice- 
ably changing with time (namely at the beginning and end of each series 
of spectra) pitch-synchronous Fourier analysis can be regarded as only 
an approximation of spectral density. For some applications, however, 
the approximation is still useful. In the first place, one can obtain a 
practical "feel" for the rate of change of pitch and harmonic structure 
in vowel-type speech sounds. Also, from the standpoint of digit simulation,
by examining the spectra one can ascertain just how and when a
speech segment becomes a digit simulation. For example, in the early 
spectra of Fig. 3, although pitch requirements for digit simulation are 
satisfied, the 10th harmonic competes with the 7th harmonic for limiter 
capture, and receiver recognition is prevented by limiter guard action 
(i.e., insufficient signal-to-noise ratio). (See Ref. 1, pp. 10-11, 13.) 

On the other hand, although early spectra of Fig. 4 show an acceptable 
harmonic structure for digit simulation, the pitch is slightly too low for 
receiver recognition. In a similar manner, one can determine how and 
when a digit-simulating waveform starts to degenerate.
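
The notion of a pitch that is "slightly too low" can be expressed as a
window of pitches for which a given harmonic lands inside a signaling
band. The sketch below computes that window. Treating the 5th harmonic
and the 1209-cps frequency as the relevant pair for Fig. 4 is an
assumption made purely for illustration, since the text does not state
the harmonic orders involved; 232.6 cps is the pitch printed on the
first spectrum of Fig. 4, and 236.5 cps is a later value used as a
contrast.

```python
TOLERANCE = 0.025  # +/- 2.5 per cent, as given for TOUCH-TONE frequencies


def pitch_window(tone_cps, order, tol=TOLERANCE):
    """Range of pitches whose harmonic `order` lies within the tone band."""
    return (tone_cps * (1.0 - tol) / order, tone_cps * (1.0 + tol) / order)


def in_window(pitch_cps, tone_cps, order, tol=TOLERANCE):
    """True when harmonic `order` of the pitch falls inside the band."""
    low, high = pitch_window(tone_cps, order, tol)
    return low <= pitch_cps <= high


low, high = pitch_window(1209.0, 5)
print(f"5th harmonic reaches the 1209-cps band for {low:.1f} to {high:.1f} cps")
print(in_window(232.6, 1209.0, 5))   # early pitch: below the window
print(in_window(236.5, 1209.0, 5))   # later pitch: inside the window
```

Under this assumption, a rise in pitch of only a few cps is enough to
carry the harmonic into the band, consistent with the gradual onset of
simulation described above.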

Admittedly, the speech segments chosen here are both rare and few in 
number. Thus, one cannot draw conclusions of statistical significance 
from this study. However, there is no reason to believe that any other 
group of frequencies of the same capacity in the voice band would not 
be simulated by the voice about as often as were these TOUCH-TONE 
calling frequencies. Therefore, such vowel-type speech segments may be
looked upon as potential digit simulations in almost any proposed
voice-frequency signaling application.

* A valid TOUCH-TONE calling signal consists of one frequency from each of
two groups: a low group (697, 770, 852 and 941 cps ±2.5 per cent) and a
high group (1209, 1336, 1477 and 1633 cps ±2.5 per cent).

REFERENCES 

1. Battista, R. N., Morrison, C. G., and Nash, D. H., Signaling System and
   Receiver for TOUCH-TONE Calling, Trans. IEEE, 66, March, 1963.

2. Mathews, M. V., Miller, Joan E., and David, E. E., Pitch Synchronous
   Analysis of Voiced Sounds, J. Acoust. Soc. Am., 33, Feb., 1961,
   pp. 179-186.